54 research outputs found

    Numerical studies of space filling designs: optimization of Latin Hypercube Samples and subprojection properties

    Get PDF
    International audienceQuantitative assessment of the uncertainties tainting the results of computer simulations is nowadays a major topic of interest in both industrial and scientific communities. One of the key issues in such studies is to get information about the output when the numerical simulations are expensive to run. This paper considers the problem of exploring the whole space of variations of the computer model input variables in the context of a large dimensional exploration space. Various properties of space filling designs are justified: interpoint-distance, discrepancy, minimum spanning tree criteria. A specific class of design, the optimized Latin Hypercube Sample, is considered. Several optimization algorithms, coming from the literature, are studied in terms of convergence speed, robustness to subprojection and space filling properties of the resulting design. Some recommendations for building such designs are given. Finally, another contribution of this paper is the deep analysis of the space filling properties of the design 2D-subprojections

    Reconnaissance d'opérations d'algèbre linéaire dans un programme polyédrique

    Get PDF
    Writing a code which uses an architecture at its full capability has become an increasingly difficult problem over the last years. For some key operations, a dedicated accelerator or a finely tuned implementation exists and delivers the best performance. Thus, when compiling a code, identifying these operations and issuing calls to their high-performance implementation is attractive. In this dissertation, we focus on the problem of detection of these operations. We propose a framework which detects linear algebra subcomputations within a polyhedral program. The main idea of this framework is to partition the computation in order to isolate different subcomputations in a regular manner, then we consider each portion of the computation and try to recognize it as a combination of linear algebra operations.We perform the partitioning of the computation by using a program transformation called monoparametric tiling. This transformation partitions the computation into blocks, whose shape is some homothetic scaling of a fixed-size partitioning. We show that the tiled program remains polyhedral while allowing a limited amount of parametrization: a single size parameter. This is an improvement compared to the previous work on tiling, that forced us to choose between these two properties.Then, in order to recognize computations, we introduce a template recognition algorithm. This template recognition algorithm is built on a state-of-the-art program equivalence algorithm. We also propose several extensions in order to manage some semantic properties.Finally, we combine these two previous contributions into a framework which detects linear algebra subcomputations. A part of this framework is a library of template, based on the BLAS specification. We demonstrate our framework on several applications.Durant ces dernières années, Il est de plus en plus compliqué d'écrire du code qui utilise une architecture au mieux de ses capacités. Certaines opérations clefs ont soit un accélérateur dédié, ou admettent une implémentation finement optimisée qui délivre les meilleurs performances. Ainsi, il est intéressant d'identifier ces opérations pendant la compilation d'un programme, et de faire appel à une implémentation optimisée.Nous nous intéressons dans cette thèse au problème de détection de ces opérations. Nous proposons un procédé qui détecte des sous-calculs correspondant à des opérations d'algèbre linéaire à l'intérieur de programmes polyédriques. L'idée principale de ce procédé est de découper le programme en sous-calculs isolés, et essayer de reconnaître chaque sous-calculs comme une combinaison d'opérateurs d'algèbre linéaire.Le découpage du calcul est effectué en utilisant une transformation de programme appelée tuilage monoparamétrique. Cette transformation partitionne le calcul en tuiles dont la forme est un agrandissement paramétrique d'une tuile de taille constante. Nous montrons que le programme tuilé reste polyédrique tout en permettant une paramétrisation limitée des tailles de tuile. Les travaux précédents sur le tuilage nous forçaient à choisir l'une de ces deux propriétés.Ensuite, afin d'identifier les opérateurs, nous introduisons un algorithme de reconnaissance de template, qui est une extension d'un algorithme d'équivalence de programme. Nous proposons plusieurs extensions afin de tenir compte des propriétés sémantiques communément rencontrées en algèbre linéaire.Enfin, nous combinons les deux contributions précédentes en un procédé qui détecte les sous-calculs correspondant à des opérateurs d'algèbre linéaire. Une de ses composantes est une librairie de template, inspirée de la spécification BLAS. Nous démontrons l'efficacité de notre procédé sur plusieurs applications

    Numerical computation of travelling breathers in Klein-Gordon chains

    Get PDF
    We numerically study the existence of travelling breathers in Klein-Gordon chains, which consist of one-dimensional networks of nonlinear oscillators in an anharmonic on-site potential, linearly coupled to their nearest neighbors. Travelling breathers are spatially localized solutions having the property of being exactly translated by pp sites along the chain after a fixed propagation time TT (these solutions generalize the concept of solitary waves for which p=1p=1). In the case of even on-site potentials, the existence of small amplitude travelling breathers superposed on a small oscillatory tail has been proved recently (G. James and Y. Sire, to appear in {\sl Comm. Math. Phys.}, 2004), the tail being exponentially small with respect to the central oscillation size. In this paper we compute these solutions numerically and continue them into the large amplitude regime for different types of even potentials. We find that Klein-Gordon chains can support highly localized travelling breather solutions superposed on an oscillatory tail. We provide examples where the tail can be made very small and is difficult to detect at the scale of central oscillations. In addition we numerically observe the existence of these solutions in the case of non even potentials

    Semantic Tiling

    Get PDF
    International audienc

    Semantic Tiling

    Get PDF
    International audienc

    Le tuilage mono-paramétrique est une transformation polyédrique

    Get PDF
    Tiling is a crucial program transformation with many benefits: it improves locality, exposes parallelism, allows for adjusting the ops-to-bytes balance of codes, and can be applied at multiple levels. Allowing tile sizes to be symbolic parameters at compile time has many benefits, including efficient autotuning, and run-time adaptability to system variations. For polyhedral programs, parametric tiling in its full generality is known to be non-linear, breaking the mathematical closure properties of the polyhedral model. Most compilation tools therefore either avoid it by only performing fixed size tiling, or apply it in only the final, code generation step. Both strategies have limitations. We first introduce mono-parametric partitioning, a restricted parametric, tiling-like transformation which can be used to express a tiling. We show that, despite being parametric, it is a polyhedral transformation. We first prove that applying mono-parametric partitioning (i) to a polyhedron yields a union of polyhedra, and (ii) to an affine function produces a piecewise-affine function. We then use these properties to show how to partition an entire polyhedral program, including one with reductions. Next, we generalize this transformation to tiles with arbitrary tile shapes that can tesselate the iteration space (e.g., hexagonal, trapezoidal, etc). We show how mono-parametric tiling can be applied at multiple levels, and enables a wide range of polyhedral analysis and transformations to be applied

    From micro-OPs to abstract resources: constructing a simpler CPU performance model through microbenchmarking

    Get PDF
    This paper describes PALMED, a tool that automatically builds a resource mapping, a performance model for pipelined, super-scalar, out-of-order CPU architectures. Resource mappings describe the execution of a program by assigning instructions in the program to abstract resources. They can be used to predict the throughput of basic blocks or as a machine model for the backend of an optimizing compiler. PALMED does not require hardware performance counters, and relies solely on runtime measurements to construct resource mappings. This allows it to model not only execution port usage, but also other limiting resources, such as the frontend or the reorder buffer. Also, thanks to a dual representation of resource mappings, our algorithm for constructing mappings scales to large instruction sets, like that of x86. We evaluate the algorithmic contribution of the paper in two ways. First by showing that our approach can reverse engineering an accurate resource mapping from an idealistic performance model produced by an existing port-mapping. We also evaluate the pertinence of our dual representation, as opposed to the standard port-mapping, for throughput modeling by extracting a representative set of basic-blocks from the compiled binaries of the Spec CPU 2017 benchmarks and comparing the throughput predicted by existing machine models to that produced by PALMED

    Bifurcations of discrete breathers in a diatomic Fermi-Pasta-Ulam chain

    Full text link
    Discrete breathers are time-periodic, spatially localized solutions of the equations of motion for a system of classical degrees of freedom interacting on a lattice. Such solutions are investigated for a diatomic Fermi-Pasta-Ulam chain, i. e., a chain of alternate heavy and light masses coupled by anharmonic forces. For hard interaction potentials, discrete breathers in this model are known to exist either as ``optic breathers'' with frequencies above the optic band, or as ``acoustic breathers'' with frequencies in the gap between the acoustic and the optic band. In this paper, bifurcations between different types of discrete breathers are found numerically, with the mass ratio m and the breather frequency omega as bifurcation parameters. We identify a period tripling bifurcation around optic breathers, which leads to new breather solutions with frequencies in the gap, and a second local bifurcation around acoustic breathers. These results provide new breather solutions of the FPU system which interpolate between the classical acoustic and optic modes. The two bifurcation lines originate from a particular ``corner'' in parameter space (omega,m). As parameters lie near this corner, we prove by means of a center manifold reduction that small amplitude solutions can be described by a four-dimensional reversible map. This allows us to derive formally a continuum limit differential equation which characterizes at leading order the numerically observed bifurcations.Comment: 30 pages, 10 figure

    PALMED: Throughput Characterization for Superscalar Architectures

    Get PDF
    International audienceIn a super-scalar architecture, the scheduler dynamically assigns micro-operations (µOPs) to execution ports. The port mapping of an architecture describes how an instruction decomposes into µOPs and lists for each µOP the set of ports it can be mapped to. It is used by compilers and performance debugging tools to characterize the performance throughput of a sequence of instructions repeatedly executed as the core component of a loop. This paper introduces a dual equivalent representation: The resource mapping of an architecture is an abstract model where, to be executed, an instruction must use a set of abstract resources, themselves representing combinations of execution ports. For a given architecture, finding a port mapping is an important but difficult problem. Building a resource mapping is a more tractable problem and provides a simpler and equivalent model. This paper describes Palmed, a tool that automatically builds a resource mapping for pipelined, super-scalar, out-of-order CPU architectures. Palmed does not require hardware performance counters, and relies solely on runtime measurements. We evaluate the pertinence of our dual representation for throughput modeling by extracting a representative set of basic-blocks from the compiled binaries of the SPEC CPU 2017 benchmarks. We compared the throughput predicted by existing machine models to that produced by Palmed, and found comparable accuracy to state-of-the art tools, achieving sub-10 % mean square error rate on this workload on Intel's Skylake microarchitecture

    Automatic Parallelization from Lustre Models in Avionics

    Get PDF
    International audienceThis poster presents ongoing research on automatic generation and execution of embedded parallel C code. We target safety-critical avionics programs specified in the synchronous language Lustre. The work described is part of the ITEA 3 project ASSUME (September 2015 - August 2018). ASSUME focuses mainly on embedded software engineering for multi-/many-core platforms. Both synthesis, e.g., automatic code generation, and verification, e.g., static analysis, of programs are addressed in the project. ASSUME is driven by the use cases of its industrial partners. One of these use cases consists in the parallelization of an avionics application comprising about 5500 Lustre nodes. After an overview of the ASSUME project, both parallel code generation and execution on a many-core platform will be presented and demonstrated
    • …
    corecore